This paper presents a comparative study of EEG-based multiclass motor
imagery classifiers based on the Kullback-Leibler regularised Riemannian
mean with a support vector machine, a hybrid one-versus-one classifier,
linear discriminant analysis, and a convolutional neural network. The
paper should be of interest to researchers working on motor imagery
classification of EEG signals. The work presented here helps to
understand the basics of different multi-class motor imagery classifiers,
their accuracy, and the number of channels involved.


Keywords: EEG, Motor imagery


1. Introduction


In brain-computer interfacing (BCI), motor imagery (MI) is a process in
which a person imagines performing a movement without actually involving
the peripheral nerves and muscles, and without even tensing the muscles.
An MI-based BCI is an independent system with higher classification
accuracy. A BCI converts signals from the brain into commands that
reflect the user’s intent and sends them to external devices such as
computers or prostheses. Among the various non-invasive BCI methods,
electroencephalography (EEG) is one of the best for recording brain
activity because of its excellent time resolution, its portability, and
its less expensive equipment. It is therefore more convenient and
practical to use EEG signals as input in BCI systems (Wang, S. Gao, and
X. Gao). A BCI system consists of three components: the input signal, the
processing unit, and the control command. The processing unit takes EEG
signals as input. The bioelectric signals resulting from electrical
activity in the brain are captured by EEG equipment (Safitri, Djamal, and
Nugraha) and must be classified by a suitable classifier before further
study or processing. Classifiers enable us to decide whether a left (or
right) hand movement or a left (or right) foot movement command has been
initiated. The classifier therefore plays a vital role in motor imagery
signal classification (Thang and Temiyasathit; Sharbaf, Fallah, and
Rashidi; Du, Liu, and Tian). The feature extraction technique is equally
critical for improving MI signal classification accuracy.


In this paper, we compare EEG-based MI classification methods and
identify those that perform well in terms of accuracy, channel count,
and complexity. The paper is organized as follows. Section 2 presents the
classification methods considered in the study. Section 3 is dedicated to
the observations and discussion. Finally, Section 4 concludes the paper.


Comparison of multi-class motor imagery classification methods for EEG signals


2. Methods for Classification 


In this section, we present four classification methods: the
Kullback-Leibler regularised Riemannian mean (KLRRM) (Mishra et al.) in
combination with a linear support vector machine (SVM), Naive Bayes (NB)
(Sharbaf, Fallah, and Rashidi), linear discriminant analysis (LDA) (Thang
and Temiyasathit), and a convolutional neural network (CNN) (Du, Liu, and
Tian). Each method has its advantages, and some are better than others in
terms of the number of channels used, complexity, and classification
accuracy. The details of the classifiers are as follows:


2.1. Linear Discriminant Analysis (LDA) 


LDA is a linear classifier commonly used to classify linearly separable
data. It finds a linear projection that maximizes the separation between
two classes, and it can also project high-dimensional data onto a
low-dimensional feature space (Kim, S.-K. Lee, and B. Lee). Over the past
decades, LDA has been used extensively for dimensionality reduction,
pattern recognition, and data classification. Thang and Temiyasathit
(Thang and Temiyasathit) used LDA to boost the accuracy of signal
classification in BCI with the regularizing multi-band common spatial
patterns (RMCSP) approach. Requiring a large number of recording channels
restricts a BCI system, so RMCSP was developed to work with EEG signals
recorded from fewer channels. The EEG data were filtered into five
distinct frequency bands by five FIR filters. The RMCSP technique
operates in two steps, as shown in Fig. 1.

a) In the first step, five FIR filters spanning five different frequency
bands were used to extract the spectral characteristics of event-related
synchronization in the brain.

b) In the second step, spatial patterns are learned for each spectral
band by regularizing common spatial patterns. The one-versus-rest (OVR)
CSP approach is used in this strategy.
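As a rough illustration of the first step, the sketch below builds a bank of five windowed-sinc band-pass FIR filters in plain Python. The band edges, the tap count, the Hamming window, the 250 Hz sampling rate, and all function names are assumptions made for this example; the filter design actually used by Thang and Temiyasathit is not stated here.

```python
import math

# Assumed band edges in Hz; the text only states that five bands are used.
BANDS = [(4, 8), (8, 12), (12, 16), (16, 24), (24, 32)]

def bandpass_fir(low_hz, high_hz, fs, num_taps=101):
    """Windowed-sinc band-pass FIR coefficients (Hamming window)."""
    def sinc(x):
        return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
    mid = (num_taps - 1) / 2
    f1, f2 = low_hz / fs, high_hz / fs  # edges in cycles per sample
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Difference of two ideal low-pass responses gives a band-pass.
        ideal = 2 * f2 * sinc(2 * f2 * k) - 2 * f1 * sinc(2 * f1 * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def fir_filter(signal, taps):
    """Causal FIR filtering by direct convolution (zero initial state)."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            if i - j >= 0:
                acc += t * signal[i - j]
        out.append(acc)
    return out

def filter_bank(signal, fs=250.0):
    """Return one filtered copy of the signal per assumed band."""
    return [fir_filter(signal, bandpass_fir(lo, hi, fs)) for lo, hi in BANDS]

def power(signal):
    return sum(x * x for x in signal) / len(signal)

# Usage: a 10 Hz test tone should carry most of its power in the 8-12 Hz band.
fs = 250.0
tone = [math.sin(2 * math.pi * 10 * n / fs) for n in range(500)]
band_power = [power(s) for s in filter_bank(tone, fs)]
```

Each band-limited copy of the signal would then be passed to the spatial filtering of the second step.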

The log-variances of the RMCSP-filtered features were used as input to
LDA, which serves as the classifier. The outputs of the four class-wise
classifiers are then combined by a majority-based voting scheme, which
assigns the class label of the classifier with the highest likelihood
(Thang and Temiyasathit).
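The feature and decision stages described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the linear scores stand in for trained LDA discriminants, and the class names, weights, and biases are hypothetical values chosen only to make the example runnable.

```python
import math

def log_variance(channel):
    """Log of the (population) variance of one spatially filtered channel."""
    mean = sum(channel) / len(channel)
    var = sum((x - mean) ** 2 for x in channel) / len(channel)
    return math.log(var)

def extract_features(filtered_trial):
    """One log-variance feature per filtered channel."""
    return [log_variance(ch) for ch in filtered_trial]

def linear_score(features, weights, bias):
    """Linear discriminant score; weights and bias would come from training."""
    return sum(w * f for w, f in zip(weights, features)) + bias

def predict_ovr(features, models):
    """Pick the class whose one-versus-rest model gives the highest score."""
    scores = {label: linear_score(features, w, b)
              for label, (w, b) in models.items()}
    return max(scores, key=scores.get)

# Hypothetical trial with two filtered channels (variances 1 and 4).
trial = [[1.0, -1.0, 1.0, -1.0], [2.0, -2.0, 2.0, -2.0]]
features = extract_features(trial)

# Hypothetical one-versus-rest models: label -> (weights, bias).
models = {
    "class_1": ([1.0, -1.0], 0.0),
    "class_2": ([-1.0, 1.0], 0.0),
    "class_3": ([0.0, 0.0], 0.5),
    "class_4": ([0.0, 0.0], 0.0),
}
predicted = predict_ovr(features, models)
```

Taking the logarithm of the variance compresses the dynamic range of band power, which tends to suit a linear classifier such as LDA.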


2.2. Naive Bayes (NB) Classifier 


The NB classifier is a family of classification methods based on Bayes’
theorem, which determines the probability of an event from prior
knowledge of a related event. Bayesian classifiers work on the basic
principle of probabilistic classification. NB has been used by Sharbaf
et al. (Sharbaf, Fallah, and Rashidi). The EEG was recorded from 22
channels following the international 10-20 system at a sampling
frequency of 250 Hz. The signals were band-pass filtered between 0.5 Hz
and 100 Hz, and a 50 Hz notch filter was also applied. The steps involved
in implementing the NB classifier (Sharbaf, Fallah, and Rashidi) are as
follows:

a) The signals were filtered with a least-squares linear-phase filter.

b) As part of signal processing, a common spatial pattern (CSP) was used
to distinguish between two classes of signals based on the differences in
variance between them. The common spatio-spectral patterns (CSSPs) were
used to embed an FIR filter into a spatial filter, and thus new channels
were defined for the delayed signals.

c) Shrinkage estimation regularizes the covariance matrix so as to
minimize the mean square error of the estimate. This overcomes the
disadvantages of the conventional covariance matrix estimation used in
CSP and CSSP.

d) Mutual information-based best individual feature (MIBIF) selection is
used to select the relevant features.

e) The one vs. one (OVO) scheme uses multiple classifiers for N-class
classification; each classifier distinguishes one class from another.

f) If three classes win equally, the combined features extracted for OVO
are used to build an NB classifier that decides the trial’s label
(Sharbaf, Fallah, and Rashidi).

g) Six linear SVM models and four NB models were used for four-class
classification without ambiguity. The architecture is shown in Fig 2.
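The voting logic of steps e) to g) can be sketched as below. With four classes there are six class pairs, which matches the six SVM models, and four possible three-class ties, which would be consistent with the four NB models; this mapping is our reading and is not spelled out above. The pairwise winners and tie-breakers are stand-ins for trained models, and resolving a two-class tie by the head-to-head result is an added assumption.

```python
from collections import Counter
from itertools import combinations

CLASSES = (1, 2, 3, 4)
PAIRS = list(combinations(CLASSES, 2))    # 6 pairwise (OVO) problems
TRIPLES = list(combinations(CLASSES, 3))  # 4 possible three-class ties

def leaders(pairwise_winner):
    """Classes sharing the highest number of pairwise wins."""
    votes = Counter(pairwise_winner[pair] for pair in PAIRS)
    top = max(votes.values())
    return tuple(sorted(c for c, v in votes.items() if v == top))

def classify(pairwise_winner, tie_breakers):
    """OVO majority vote with a fallback model for three-class ties."""
    tied = leaders(pairwise_winner)
    if len(tied) == 1:
        return tied[0]
    if len(tied) == 2:
        # Assumption: a two-class tie is settled by the head-to-head result.
        return pairwise_winner[tied]
    # Three-class tie: defer to the model trained for that triple.
    return tie_breakers[tied](tied)

# Stand-in tie-breakers (a real system would use trained NB models).
tie_breakers = {triple: (lambda tied: tied[-1]) for triple in TRIPLES}

# Class 1 wins all of its pairwise contests: no ambiguity.
clear = {(1, 2): 1, (1, 3): 1, (1, 4): 1, (2, 3): 2, (2, 4): 4, (3, 4): 3}
# Classes 1, 2, and 3 each win twice: a three-class tie.
cyclic = {(1, 2): 1, (1, 3): 3, (1, 4): 1, (2, 3): 2, (2, 4): 2, (3, 4): 3}

label_clear = classify(clear, tie_breakers)
label_tied = classify(cyclic, tie_breakers)
```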


2.3. Shallow Convolution Network Architecture 


FIGURE 1. Regularizing multi-band CSP architecture (Thang and Temiyasathit)
FIGURE 2. Hybrid architecture of OVO and NB classifier (Sharbaf, Fallah, and Rashidi)
FIGURE 3. The novel CNN architecture (Du, Liu, and Tian)


For MI task detection and classification, researchers have begun to use
deep learning techniques such as CNNs, which have outperformed
traditional approaches. A shallow CNN architecture (Du, Liu, and Tian)
was used with a signal-superposition data augmentation strategy to
improve classification accuracy. The architecture (Fig. 3) consists of
three convolutional layers and four fully connected layers. New
artificial EEG data are generated by superposing and normalizing signals
with the same label across subjects and time. This augmentation strategy
helps signals retain their intrinsic properties while reducing the effect
of signal drift over time and across subjects. The shallow CNN design
classifies better than the preceding architectures, with an average
accuracy of 91.06% for two-class classification tasks. EEG data were
recorded while the subject imagined moving a part of the body, and data
augmentation was used to create more training data for the deep learning
model. The authors performed the following steps:


a) To tackle the problem of data scarcity, fresh data are generated
artificially from the existing training data by shifting, scaling, and
rotating the real data. This technique is called data augmentation.

b) The novel CNN architecture is composed of three convolutional layers
and four fully connected (FC) layers.

c) The first layer performs linear pre-filtering along the time axis for
each EEG channel.

d) By performing convolution along the EEG channel axis, the second layer
can suppress the effects of regions unrelated to movement.

e) The next layer provides the most robust architecture of all the
layers.

f) Three FC layers are applied after the convolutional layers, with the
first FC layer containing approximately 6300 neurons. The last FC layer
is the softmax layer, whose number of neurons corresponds to the number
of classes to categorize (Du, Liu, and Tian).
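The signal-superposition augmentation described in this section can be illustrated with the sketch below. It is only one plausible reading: two trials with the same label are summed channel by channel and then rescaled to zero mean and unit variance. The exact superposition and normalization used by Du, Liu, and Tian are not given here, so the function names and the z-score normalization are assumptions.

```python
import random

def normalize(channel):
    """Zero-mean, unit-variance scaling of one channel (assumed scheme)."""
    n = len(channel)
    mean = sum(channel) / n
    std = (sum((x - mean) ** 2 for x in channel) / n) ** 0.5
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in channel]

def superpose(trial_a, trial_b):
    """Add two same-label trials channel by channel, then normalize."""
    return [normalize([a + b for a, b in zip(ch_a, ch_b)])
            for ch_a, ch_b in zip(trial_a, trial_b)]

def augment(trials, labels, n_new, seed=0):
    """Create n_new artificial trials from random same-label pairs."""
    rng = random.Random(seed)
    by_label = {}
    for trial, label in zip(trials, labels):
        by_label.setdefault(label, []).append(trial)
    eligible = sorted(l for l, group in by_label.items() if len(group) >= 2)
    if not eligible:
        raise ValueError("need at least two trials sharing a label")
    new_trials, new_labels = [], []
    for _ in range(n_new):
        label = rng.choice(eligible)
        a, b = rng.sample(by_label[label], 2)
        new_trials.append(superpose(a, b))
        new_labels.append(label)
    return new_trials, new_labels

# Toy data: four single-channel trials, two per label.
trials = [[[1.0, 2.0, 3.0, 4.0]], [[0.0, 1.0, 0.0, 1.0]],
          [[4.0, 3.0, 2.0, 1.0]], [[1.0, 0.0, 1.0, 0.0]]]
labels = [0, 0, 1, 1]
new_trials, new_labels = augment(trials, labels, n_new=3)
```

Because each artificial trial inherits the label of its two parents, the augmented set can be appended directly to the training data.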


2.4. Kullback-Leibler Regularized Riemannian Mean (KLRRM) and Linear SVM


Kullback-Leibler regularization makes feature extraction more robust
against noise and outliers. With KLRRM-based feature extraction, the
classification accuracy is improved for almost all subjects. After the
distances to the Riemannian means of all four classes are calculated, a
linear SVM (LSVM) is employed to classify the data. The combined KLRRM
and LSVM framework achieved the highest accuracy for four subjects.
Mishra et al. (Mishra et al.) used this method to classify four-class MI
signals. A highly precise analog-to-digital converter with a 250 Hz
sampling rate was used to digitize the analog EEG signals. The authors
adopted the methodology described below and shown in Fig. 4.

a) A sixth-order Butterworth bandpass filter was used to filter the MI
signals in the 8-30 Hz frequency range.

b) Feature extraction was performed with the KLRRM method to improve the
classification accuracy.

c) For all four MI classes, the Riemannian mean is derived with
regularisation in order to make feature extraction resilient against
outliers.

d) The LSVM is trained using the one vs. another mechanism of multi-class
classification. Performance on the sub-test set is examined using the
trained LSVM and the regularized Riemannian mean matrix.

e) The procedure is repeated for all candidate values of β (the
regularization factor), and the value giving the highest accuracy is
considered optimal for the subject of interest. The validation set is
then used to test the optimal β (Thang and Temiyasathit).
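The search over the regularization factor in step e) amounts to a simple grid search, sketched below. The candidate grid and the accuracy function are hypothetical: in practice the function would train the LSVM with the given regularization factor and return the accuracy on the sub-test set, whereas here a toy function stands in for it.

```python
def select_beta(candidates, accuracy_on_subtest):
    """Return the candidate with the highest sub-test accuracy."""
    best_beta, best_acc = None, float("-inf")
    for beta in candidates:
        acc = accuracy_on_subtest(beta)
        if acc > best_acc:
            best_beta, best_acc = beta, acc
    return best_beta, best_acc

# Assumed grid of regularization factors: 0.0, 0.1, ..., 1.0.
candidates = [i / 10 for i in range(11)]

def toy_accuracy(beta):
    """Stand-in for 'train with beta, then score on the sub-test set'."""
    return 1.0 - (beta - 0.3) ** 2

best_beta, best_acc = select_beta(candidates, toy_accuracy)
```

Keeping the validation set out of this loop, as the procedure describes, means the accuracy reported for the chosen value is not biased by the selection itself.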


3. Discussion 


In this paper, we have reviewed various EEG-based multi-class MI
classifiers, namely LDA, LSVM, CNN, and hybrid one vs. one (OVO). The
performance comparison of these classifiers is summarized in Table 1. A
limitation of conventional OVO is that more than one class can be
assigned as the trial label, which happens when several classes have
almost equal chances compared with the other classes. The hybrid OVO
classifier system was proposed to overcome this limitation. The novel
shallow CNN architecture was proposed to overcome a limitation of
conventional CNN architectures, which are suitable for two-class
classification only. The novel CNN architecture is suitable for
four-class classification, is more suitable for real-time brain-computer
interface (BCI) systems, and is better than some of the traditional
machine learning-based approaches. A further limitation of BCI systems is
the large number of channels used for recording. RMCSP is designed to
handle EEG signals with a smaller number of channels, and with LDA it
shows classification accuracy 10% better than conventional CSP. The KLRRM
framework provides good accuracy for poor or noisy channels, which shows
its robustness towards noise and outliers, while maintaining the accuracy
for good subjects. The combined KLRRM and LSVM mechanism thus provides
better performance for both good and poor subjects.


4. Conclusion


One of the main reasons for a high misclassification rate is noise in the
EEG data. Classification of MI signals is a delicate and complex process,
and the presence of noise and outliers makes it more vulnerable to error
(Brigham and Kumar; Zanini et al.). The combination of KLRRM and LSVM is
one of the best methods that take the effect of noise and outliers into
account. According to Mishra et al. (Mishra et al.), the four-class
classification method provided an accuracy of 74.73% for channels not
affected by noise and 51.53% for noise-dominated channels, which is quite
a good result for such delicate MI signals, where the classifier plays a
vital role in the overall process. The four-class classification method
is quite promising and implementable. It can be made more advanced and
accurate for a real-time MI classifier.


FIGURE 4. Architecture of KLRRM and LSVM (Mishra et al.)


TABLE 1. Performance comparison

Author | Pre-processing + feature extraction + classifier | Number of channels used | Accuracy
Thang and Temiyasathit (Thang and Temiyasathit) | FIR filters + Regularising CSP + OVR CSP + LDA | 22 | Outperforms normal CSP by 10%
Sharbaf et al. (Sharbaf, Fallah, and Rashidi) | FIR filters + CSP + CSSP + MIBIF + hybrid OVO | 22 | Improvement in kappa score to 0.61
Du et al. (Du, Liu, and Tian) | Data augmentation + CNN | 16 | Average cross-validation accuracy of 66.73% (global) and 76.78% (subject model) for 4-class classification
Mishra et al. (Mishra et al.) | Bandpass filter + KLRRM + MDRM + LSVM | 22 | 74.43% and 51.53% for good and noisy channels, respectively